Markovian Discrimination
   HOME

TheInfoList



OR:

Within the
probability theory Probability theory is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set o ...
Markov model In probability theory, a Markov model is a stochastic model used to Mathematical model, model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, i ...
, Markovian discrimination in spam filtering is a method used in
CRM114 The CRM 114 Discriminator is a fictional piece of radio equipment in Stanley Kubrick's film ''Dr. Strangelove'' (1964), the destruction of which prevents the crew of a B-52 from receiving the recall code that would stop them from dropping their ...
and other spam filters to model the statistical behaviors of spam and nonspam more accurately than in simple
Bayesian methods Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and e ...
. A simple Bayesian model of written text contains only the dictionary of legal words and their relative probabilities. A Markovian model adds the relative transition probabilities that given one word, predict what the next word will be. It is based on the theory of
Markov chain A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happe ...
s by
Andrey Markov Andrey Andreyevich Markov, first name also spelled "Andrei", in older works also spelled Markoff) (14 June 1856 – 20 July 1922) was a Russian mathematician best known for his work on stochastic processes. A primary subject of his research lat ...
, hence the name. In essence, a Bayesian filter works on single words alone, while a Markovian filter works on phrases or entire sentences. There are two types of
Markov models In probability theory, a Markov model is a stochastic model used to model pseudo-randomly changing systems. It is assumed that future states depend only on the current state, not on the events that occurred before it (that is, it assumes the Marko ...
; the visible Markov model, and the
hidden Markov model A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process — call it X — with unobservable ("''hidden''") states. As part of the definition, HMM requires that there be an ob ...
or HMM. The difference is that with a visible Markov model, the current word is considered to contain the entire state of the language model, while a hidden Markov model hides the state and presumes only that the current word is probabilistically related to the actual internal state of the language. For example, in a visible Markov model the word "the" should predict with accuracy the following word, while in a hidden Markov model, the entire prior text implies the actual state and predicts the following words, but does not actually guarantee that state or prediction. Since the latter case is what's encountered in spam filtering, hidden Markov models are almost always used. In particular, because of storage limitations, the specific type of hidden Markov model called a
Markov random field In the domain of physics and probability, a Markov random field (MRF), Markov network or undirected graphical model is a set of random variables having a Markov property described by an undirected graph. In other words, a random field is said to b ...
is particularly applicable, usually with a clique size of between four and six tokens.


See also

*
Maximum-entropy Markov model In statistics, a maximum-entropy Markov model (MEMM), or conditional Markov model (CMM), is a graphical model for sequence labeling that combines features of hidden Markov models (HMMs) and maximum entropy (MaxEnt) models. An MEMM is a discrimina ...


References

* Chhabra, S., Yerazunis, W. S., and Siefkes, C. 2004. ''Spam Filtering using a Markov Random Field Model with Variable Weighting Schemas.'' In Proceedings of the Fourth IEEE international Conference on Data Mining (November 1–04, 2004). ICDM. IEEE Computer Society, Washington, DC, Mazharul {{DEFAULTSORT:Markovian Discrimination Spam filtering Markov models Statistical natural language processing